Time Series Analysis
Toward Reasoning-Centric Time-Series Analysis
Wang, Xinlei, Tan, Mingtian, Qiu, Jing, Zhao, Junhua, Gu, Jinjin
Abstract--T raditional time series analysis has long relied on pattern recognition, trained on static and well-established benchmarks. However, in real-world settings - where policies shift, human behavior adapts, and unexpected events unfold - effective analysis must go beyond surface-level trends to uncover the actual forces driving them. The recent rise of Large Language Models (LLMs) presents new opportunities for rethinking time series analysis by integrating multimodal inputs. However, as the use of LLMs becomes popular, we must remain cautious, asking why we use LLMs and how to exploit them effectively . Most existing LLM-based methods still employ their numerical regression ability and ignore their deeper reasoning potential. This paper argues for rethinking time series with LLMs as a reasoning task that prioritizes causal structure and explainability . This shift brings time series analysis closer to human-aligned understanding, enabling transparent and context-aware insights in complex real-world environments. Time series analysis has traditionally been framed as a pattern recognition problem, extracting trends and correlations from observed data.
MLLM4TS: Leveraging Vision and Multimodal Language Models for General Time-Series Analysis
Liu, Qinghua, Heshmati, Sam, Mai, Zheda, Abraham, Zubin, Paparrizos, John, Ren, Liu
Effective analysis of time series data presents significant challenges due to the complex temporal dependencies and cross-channel interactions in multivariate data. Inspired by the way human analysts visually inspect time series to uncover hidden patterns, we ask: can incorporating visual representations enhance automated time-series analysis? Recent advances in multimodal large language models have demonstrated impressive generalization and visual understanding capability, yet their application to time series remains constrained by the modality gap between continuous numerical data and discrete natural language. To bridge this gap, we introduce MLLM4TS, a novel framework that leverages multimodal large language models for general time-series analysis by integrating a dedicated vision branch. Each time-series channel is rendered as a horizontally stacked color-coded line plot in one composite image to capture spatial dependencies across channels, and a temporal-aware visual patch alignment strategy then aligns visual patches with their corresponding time segments. MLLM4TS fuses fine-grained temporal details from the numerical data with global contextual information derived from the visual representation, providing a unified foundation for multimodal time-series analysis. Extensive experiments on standard benchmarks demonstrate the effectiveness of MLLM4TS across both predictive tasks (e.g., classification) and generative tasks (e.g., anomaly detection and forecasting). These results underscore the potential of integrating visual modalities with pretrained language models to achieve robust and generalizable time-series analysis.
MONAQ: Multi-Objective Neural Architecture Querying for Time-Series Analysis on Resource-Constrained Devices
The growing use of smartphones and IoT devices necessitates efficient time-series analysis on resource-constrained hardware, which is critical for sensing applications such as human activity recognition and air quality prediction. Recent efforts in hardware-aware neural architecture search (NAS) automate architecture discovery for specific platforms; however, none focus on general time-series analysis with edge deployment. Leveraging the problem-solving and reasoning capabilities of large language models (LLM), we propose MONAQ, a novel framework that reformulates NAS into Multi-Objective Neural Architecture Querying tasks. MONAQ is equipped with multimodal query generation for processing multimodal time-series inputs and hardware constraints, alongside an LLM agent-based multi-objective search to achieve deployment-ready models via code generation. By integrating numerical data, time-series images, and textual descriptions, MONAQ improves an LLM's understanding of time-series data. Experiments on fifteen datasets demonstrate that MONAQ-discovered models outperform both handcrafted models and NAS baselines while being more efficient.
Eliciting Chain-of-Thought Reasoning for Time Series Analysis using Reinforcement Learning
Parker, Felix, Chan, Nimeesha, Zhang, Chi, Ghobadi, Kimia
Complex numerical time series analysis often demands multi-step reasoning capabilities beyond current models' reach. Tasks like medical diagnosis and weather forecasting require sequential reasoning processes -- including counterfactual analysis, logical deduction, knowledge application, and multi-modal contextual integration -- that existing time series models cannot explicitly perform. While recent research has shown large language models (LLMs) can achieve sophisticated Chain-of-Thought (CoT) reasoning through reinforcement learning (RL), these advances have primarily focused on mathematical and coding domains, with LLMs still demonstrating poor performance on time series tasks. We introduce Chain Of thought for Understanding Numerical Time Series (COUNTS), the first framework that trains LLMs to perform CoT reasoning across diverse time series tasks using RL with verifiable rewards. Our approach employs a Residual Vector-Quantized VAE to create high-fidelity discrete tokens that seamlessly integrate into a pre-trained LLM's vocabulary. COUNTS undergoes a two-stage training process: first, supervised fine-tuning on time series analysis tasks to master our novel representations, followed by Group Relative Policy Optimization training on verifiable problems using prompting strategies that encourage explicit reasoning steps before producing final answers. Our experiments demonstrate that this RL-driven approach with intermediate CoT reasoning significantly enhances LLM performance across various time series analysis tasks, opening new possibilities for complex temporal data reasoning.
Interpretable time series analysis with Gumbel dynamics
Wang, Yiliu, Kim, Timothy Doyeon, Shea-Brown, Eric, Sรผmbรผl, Uygar
Switching dynamical systems can model complicated time series data while maintaining interpretability by inferring a finite set of dynamics primitives and explaining different portions of the observed time series with one of these primitives. However, due to the discrete nature of this set, such models struggle to capture smooth, variable-speed transitions, as well as stochastic mixtures of overlapping states, and the inferred dynamics often display spurious rapid switching on real-world datasets. Here, we propose the Gumbel Dynamical Model (GDM). First, by introducing a continuous relaxation of discrete states and a different noise model defined on the relaxed-discrete state space via the Gumbel distribution, GDM expands the set of available state dynamics, allowing the model to approximate smoother and non-stationary ground-truth dynamics more faithfully. Second, the relaxation makes the model fully differentiable, enabling fast and scalable training with standard gradient descent methods. We validate our approach on standard simulation datasets and highlight its ability to model soft, sticky states and transitions in a stochastic setting. Furthermore, we apply our model to two real-world datasets, demonstrating its ability to infer interpretable states in stochastic time series with multiple dynamics, a setting where traditional methods often fail.
Multi-View Contrastive Learning for Robust Domain Adaptation in Medical Time Series Analysis
Adapting machine learning models to medical time series across different domains remains a challenge due to complex temporal dependencies and dynamic distribution shifts. Current approaches often focus on isolated feature representations, limiting their ability to fully capture the intricate temporal dynamics necessary for robust domain adaptation. In this work, we propose a novel framework leveraging multi-view contrastive learning to integrate temporal patterns, derivative-based dynamics, and frequency-domain features. Our method employs independent encoders and a hierarchical fusion mechanism to learn feature-invariant representations that are transferable across domains while preserving temporal coherence. Extensive experiments on diverse medical datasets, including electroencephalogram (EEG), electrocardiogram (ECG), and electromyography (EMG) demonstrate that our approach significantly outperforms state-of-the-art methods in transfer learning tasks. By advancing the robustness and generalizability of machine learning models, our framework offers a practical pathway for deploying reliable AI systems in diverse healthcare settings. Data and Code Availability This study uses publicly available datasets in medical and healthcare domains, including SleepEEG (Kemp et al., 2000) and ECG (Clifford et al., 2017) for pre-training, and Epilepsy (Andrzejak et al., 2001), FD (Less-meier et al., 2016), Gesture (Liu et al., 2009), and EMG (Goldberger et al., 2000) for fine-tuning. The datasets used in this study are publicly accessible via their respective repositories, with detailed documentation included in the supplementary material.
JustDense: Just using Dense instead of Sequence Mixer for Time Series analysis
Park, TaekHyun, Lee, Yongjae, Park, Daesan, Kim, Dohee, Bae, Hyerim
Sequence and channel mixers, the core mechanism in sequence models, have become the de facto standard in time series analysis (TSA). However, recent studies have questioned the necessity of complex sequence mixers, such as attention mechanisms, demonstrating that simpler architectures can achieve comparable or even superior performance. This suggests that the benefits attributed to complex sequencemixers might instead emerge from other architectural or optimization factors. Based on this observation, we pose a central question: Are common sequence mixers necessary for time-series analysis? Therefore, we propose JustDense, an empirical study that systematically replaces sequence mixers in various well-established TSA models with dense layers. Grounded in the MatrixMixer framework, JustDense treats any sequence mixer as a mixing matrix and replaces it with a dense layer. This substitution isolates the mixing operation, enabling a clear theoretical foundation for understanding its role. Therefore, we conducted extensive experiments on 29 benchmarks covering five representative TSA tasks using seven state-of-the-art TSA models to address our research question. The results show that replacing sequence mixers with dense layers yields comparable or even superior performance. In the cases where dedicated sequence mixers still offer benefits, JustDense challenges the assumption that "deeper and more complex architectures are inherently better" in TSA.
HGTS-Former: Hierarchical HyperGraph Transformer for Multivariate Time Series Analysis
Wang, Xiao, Si, Hao, Zhang, Fan, Zhou, Xiaoya, Sun, Dengdi, Lyu, Wanli, Yang, Qingquan, Tang, Jin
Multivariate time series analysis has long been one of the key research topics in the field of artificial intelligence. However, analyzing complex time series data remains a challenging and unresolved problem due to its high dimensionality, dynamic nature, and complex interactions among variables. Inspired by the strong structural modeling capability of hypergraphs, this paper proposes a novel hypergraph-based time series transformer backbone network, termed HGTS-Former, to address the multivariate coupling in time series data. Specifically, given the multivariate time series signal, we first normalize and embed each patch into tokens. Then, we adopt the multi-head self-attention to enhance the temporal representation of each patch. The hierarchical hypergraphs are constructed to aggregate the temporal patterns within each channel and fine-grained relations between different variables. After that, we convert the hyperedge into node features through the EdgeToNode module and adopt the feed-forward network to further enhance the output features. Extensive experiments conducted on two multivariate time series tasks and eight datasets fully validated the effectiveness of our proposed HGTS-Former. The source code will be released on https://github.com/Event-AHU/Time_Series_Analysis.
Failure Risk Prediction in a MOOC: A Multivariate Time Series Analysis Approach
Ayady, Anass El, Devanne, Maxime, Forestier, Germain, Mawas, Nour El
MOOCs offer free and open access to a wide audience, but completion rates remain low, often due to a lack of personalized content. To address this issue, it is essential to predict learner performance in order to provide tailored feedback. Behavioral traces-such as clicks and events-can be analyzed as time series to anticipate learners' outcomes. This work compares multivariate time series classification methods to identify at-risk learners at different stages of the course (after 5, 10 weeks, etc.). The experimental evaluation, conducted on the Open University Learning Analytics Dataset (OULAD), focuses on three courses: two in STEM and one in SHS. Preliminary results show that the evaluated approaches are promising for predicting learner failure in MOOCs. The analysis also suggests that prediction accuracy is influenced by the amount of recorded interactions, highlighting the importance of rich and diverse behavioral data.
From Images to Signals: Are Large Vision Models Useful for Time Series Analysis?
Zhao, Ziming, Shen, ChengAo, Tong, Hanghang, Song, Dongjin, Deng, Zhigang, Wen, Qingsong, Ni, Jingchao
Transformer-based models have gained increasing attention in time series research, driving interest in Large Language Models (LLMs) and foundation models for time series analysis. As the field moves toward multi-modality, Large Vision Models (LVMs) are emerging as a promising direction. In the past, the effectiveness of Transformer and LLMs in time series has been debated. When it comes to LVMs, a similar question arises: are LVMs truely useful for time series analysis? To address it, we design and conduct the first principled study involving 4 LVMs, 8 imaging methods, 18 datasets and 26 baselines across both high-level (classification) and low-level (forecasting) tasks, with extensive ablation analysis. Our findings indicate LVMs are indeed useful for time series classification but face challenges in forecasting. Although effective, the contemporary best LVM forecasters are limited to specific types of LVMs and imaging methods, exhibit a bias toward forecasting periods, and have limited ability to utilize long look-back windows. We hope our findings could serve as a cornerstone for future research on LVM- and multimodal-based solutions to different time series tasks.